Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique
Internal identifier: 000778 (Main/Exploration); previous: 000777; next: 000779
Authors: Apurva A. Desai [India]
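French descriptors
- Pascal (Inist): Reconnaissance caractère, Reconnaissance forme, Caractère manuscrit, Reconnaissance optique caractère, Famille langage, Langage naturel, Morphologie mathématique, Classification, Extraction caractéristique, Rapport aspect, Extraction forme, Analyse statistique, Modélisation.
- Wicri:
- topic: Classification.
English descriptors
- KwdEn: Aspect ratio, Character recognition, Classification, Feature extraction, Language family, Manuscript character, Mathematical morphology, Modeling, Natural language, Optical character recognition, Pattern extraction, Pattern recognition, Statistical analysis.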
Abstract
A lot of work has been done on Optical Character Recognition (OCR) for various Indian languages. For Gujarati, however, a language belonging to the Devnagari family of languages and spoken in the western Indian state of Gujarat, hardly any work can be traced, especially for handwritten characters. In this work I address the problem of recognizing handwritten Gujarati numerals. Pre-processing techniques such as global thresholding, erosion and dilation, and skew correction are used. A novel hybrid feature extraction technique is proposed that combines a structural approach with a statistical approach. The image is subdivided and the pixel information of the subdivisions serves as the structural feature, whereas the aspect ratio of the numeral serves as the statistical feature. A kNN classifier is used for classification. The model gives an overall accuracy of 96.99% on handwritten Gujarati numerals.
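The abstract gives only an outline of the method, so the following is a minimal sketch of the general idea rather than the author's implementation. It assumes the image is already binarized (1 = ink, 0 = background), so the thresholding, morphology and skew-correction steps are skipped. The 2x2 grid, the use of per-zone pixel density, the Euclidean distance, the value of k, and the names hybrid_features and knn_predict are all assumptions made for illustration. The toy shapes are not Gujarati numerals.

```python
from collections import Counter
from math import dist


def bounding_box(img):
    """Return (top, bottom, left, right) of the foreground pixels (value 1)."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for c in range(len(img[0])) if any(row[c] for row in img)]
    return rows[0], rows[-1], cols[0], cols[-1]


def hybrid_features(img, grid=2):
    """Zone pixel densities (structural part) plus aspect ratio (statistical part).

    The grid size and the density measure are assumptions; the abstract only
    says the image is subdivided and the pixel information is used.
    """
    top, bottom, left, right = bounding_box(img)
    height, width = bottom - top + 1, right - left + 1
    features = []
    for gr in range(grid):
        for gc in range(grid):
            r0, r1 = top + gr * height // grid, top + (gr + 1) * height // grid
            c0, c1 = left + gc * width // grid, left + (gc + 1) * width // grid
            cells = [img[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            # A zone can be empty when the shape is narrower than the grid.
            features.append(sum(cells) / len(cells) if cells else 0.0)
    features.append(width / height)  # aspect ratio of the bounding box
    return features


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors nearest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy binary shapes standing in for labelled training samples.
VERTICAL_BAR = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
WIDE_BLOCK = [[1, 1, 1, 1], [1, 1, 1, 1]]

train = [
    (hybrid_features(VERTICAL_BAR), "bar"),
    (hybrid_features(WIDE_BLOCK), "block"),
]

# A taller bar than the training one should still land nearest to "bar".
query = hybrid_features([[1], [1], [1], [1], [1]])
print(knn_predict(train, query, k=1))  # → bar
```

In this sketch the aspect ratio is simply appended to the zone densities, so it shares one distance computation with them. A real system would likely need to scale the features so that neither part dominates; the abstract does not say how the two parts are combined.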
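Affiliations: India (Apurva A. Desai, Department of Computer Science, Veer Narmad South Gujarat University, Surat)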
Links toward previous steps (curation, corpus...)
- to stream PascalFrancis, to step Corpus: 000089
- to stream PascalFrancis, to step Curation: 000683
- to stream PascalFrancis, to step Checkpoint: 000152
- to stream Main, to step Merge: 000783
- to stream Main, to step Curation: 000778
The document in XML format
<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique</title>
<author><name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Computer Science, Veer Narmad South Gujarat University, Udhna Magdalla Road</s1>
<s2>Surat - 395007, Gujarat</s2>
<s3>IND</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Inde</country>
<wicri:noRegion>Surat - 395007, Gujarat</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">12-0306480</idno>
<date when="2010">2010</date>
<idno type="stanalyst">PASCAL 12-0306480 INIST</idno>
<idno type="RBID">Pascal:12-0306480</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000089</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000683</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000152</idno>
<idno type="wicri:Area/Main/Merge">000783</idno>
<idno type="wicri:Area/Main/Curation">000778</idno>
<idno type="wicri:Area/Main/Exploration">000778</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique</title>
<author><name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Computer Science, Veer Narmad South Gujarat University, Udhna Magdalla Road</s1>
<s2>Surat - 395007, Gujarat</s2>
<s3>IND</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Inde</country>
<wicri:noRegion>Surat - 395007, Gujarat</wicri:noRegion>
</affiliation>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Aspect ratio</term>
<term>Character recognition</term>
<term>Classification</term>
<term>Feature extraction</term>
<term>Language family</term>
<term>Manuscript character</term>
<term>Mathematical morphology</term>
<term>Modeling</term>
<term>Natural language</term>
<term>Optical character recognition</term>
<term>Pattern extraction</term>
<term>Pattern recognition</term>
<term>Statistical analysis</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance caractère</term>
<term>Reconnaissance forme</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance optique caractère</term>
<term>Famille langage</term>
<term>Langage naturel</term>
<term>Morphologie mathématique</term>
<term>Classification</term>
<term>Extraction caractéristique</term>
<term>Rapport aspect</term>
<term>Extraction forme</term>
<term>Analyse statistique</term>
<term>Modélisation</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Lot of work has been done for Optical Character Recognition (OCR) for various Indian languages. But for Gujarati, a language belonging to Devnagari family of languages and spoken in the western state of Gujarat in India, hardly any work can be traced especially for handwritten characters. In this work I have addressed the problem of handwritten Gujarati numerals. Here pre-process techniques like global threshold, erosion and dilation, skew correction etc, are used. A novel hybrid feature extraction technique is suggested and used which is constituted by a structural approach and statistical approach of feature extraction. Image is subdivided and then the pixel information is used as a structural approach whereas the aspect ratio of the number is considered as a statistical approach. For classification kNN classifier has been used. This model gives overall accuracy of 96.99% for the handwritten Gujarati numerals.</div>
</front>
</TEI>
<affiliations><list><country><li>Inde</li>
</country>
</list>
<tree><country name="Inde"><noRegion><name sortKey="Desai, Apurva A" sort="Desai, Apurva A" uniqKey="Desai A" first="Apurva A." last="Desai">Apurva A. Desai</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000778 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000778 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= Pascal:12-0306480 |texte= Handwritten Gujarati Numeral Optical Character Recognition using Hybrid Feature Extraction Technique }}
This area was generated with Dilib version V0.6.32.